average error
ArtReg: Visuo-Tactile based Pose Tracking and Manipulation of Unseen Articulated Objects
Murali, Prajval Kumar, Kaboli, Mohsen
Robots operating in real-world environments frequently encounter unknown objects with complex structures and articulated components, such as doors, drawers, cabinets, and tools. The ability to perceive, track, and manipulate these objects without prior knowledge of their geometry or kinematic properties remains a fundamental challenge in robotics. In this work, we present a novel method for visuo-tactile-based tracking of unseen objects (single, multiple, or articulated) during robotic interaction without assuming any prior knowledge regarding object shape or dynamics. Our novel pose tracking approach termed ArtReg (stands for Articulated Registration) integrates visuo-tactile point clouds in an unscented Kalman Filter formulation in the SE(3) Lie Group for point cloud registration. ArtReg is used to detect possible articulated joints in objects using purposeful manipulation maneuvers such as pushing or hold-pulling with a two-robot team. Furthermore, we leverage ArtReg to develop a closed-loop controller for goal-driven manipulation of articulated objects to move the object into the desired pose configuration. We have extensively evaluated our approach on various types of unknown objects through real robot experiments. We also demonstrate the robustness of our method by evaluating objects with varying center of mass, low-light conditions, and with challenging visual backgrounds. Furthermore, we benchmarked our approach on a standard dataset of articulated objects and demonstrated improved performance in terms of pose accuracy compared to state-of-the-art methods. Our experiments indicate that robust and accurate pose tracking leveraging visuo-tactile information enables robots to perceive and interact with unseen complex articulated objects (with revolute or prismatic joints).
Export Reviews, Discussions, Author Feedback and Meta-Reviews
First provide a summary of the paper, and then address the following criteria: Quality, clarity, originality and significance. This paper presented a new method for spike-sorting, although the method itself is general and not limited to spike-sorting, and was presented in a generic way. The authors introduced two new ideas. First is to use SVD to define basis functions and compute precise spike times by interpolation, which is I think a very neat idea. To this end, Taylor or polar interpolators are previously used (Ekanadham et al., 2011), but the authors pointed out that SVD-based method is theoretically optimal in MSE sense.
Enhanced Vehicle Speed Detection Considering Lane Recognition Using Drone Videos in California
Naeini, Amirali Ataee, Teymouri, Ashkan, Jafarsalehi, Ghazaleh, Zhang, Michael
The increase in vehicle numbers in California, driven by inadequate transportation systems and sparse speed cameras, necessitates effective vehicle speed detection. Detecting vehicle speeds per lane is critical for monitoring High-Occupancy Vehicle (HOV) lane speeds, distinguishing between cars and heavy vehicles with differing speed limits, and enforcing lane restrictions for heavy vehicles. While prior works utilized YOLO (You Only Look Once) for vehicle speed detection, they often lacked accuracy, failed to identify vehicle lanes, and offered limited or less practical classification categories. This study introduces a fine-tuned YOLOv11 model, trained on almost 800 bird's-eye view images, to enhance vehicle speed detection accuracy which is much higher compare to the previous works. The proposed system identifies the lane for each vehicle and classifies vehicles into two categories: cars and heavy vehicles. Designed to meet the specific requirements of traffic monitoring and regulation, the model also evaluates the effects of factors such as drone height, distance of Region of Interest (ROI), and vehicle speed on detection accuracy and speed measurement. Drone footage collected from Northern California was used to assess the proposed system. The fine-tuned YOLOv11 achieved its best performance with a mean absolute error (MAE) of 0.97 mph and mean squared error (MSE) of 0.94 $\text{mph}^2$, demonstrating its efficacy in addressing challenges in vehicle speed detection and classification.
Indoor Air Quality Detection Robot Model Based on the Internet of Things (IoT)
Simamora, Anggiat Mora, Denih, Asep, Suriansyah, Mohamad Iqbal
This paper presents the design, implementation, and evaluation of an IoT-based robotic system for mapping and monitoring indoor air quality. The primary objective was to develop a mobile robot capable of autonomously mapping a closed environment, detecting concentrations of CO$_2$, volatile organic compounds (VOCs), smoke, temperature, and humidity, and transmitting real-time data to a web interface. The system integrates a set of sensors (SGP30, MQ-2, DHT11, VL53L0X, MPU6050) with an ESP32 microcontroller. It employs a mapping algorithm for spatial data acquisition and utilizes a Mamdani fuzzy logic system for air quality classification. Empirical tests in a model room demonstrated average localization errors below $5\%$, actuator motion errors under $2\%$, and sensor measurement errors within $12\%$ across all modalities. The contributions of this work include: (1) a low-cost, integrated IoT robotic platform for simultaneous mapping and air quality detection; (2) a web-based user interface for real-time visualization and control; and (3) validation of system accuracy under laboratory conditions.
Optimal starting point for time series forecasting
Zhong, Yiming, Ren, Yinuo, Cao, Guangyao, Li, Feng, Qi, Haobo
Recent advances on time series forecasting mainly focus on improving the forecasting models themselves. However, managing the length of the input data can also significantly enhance prediction performance. In this paper, we introduce a novel approach called Optimal Starting Point Time Series Forecast (OSP-TSP) to capture the intrinsic characteristics of time series data. By adjusting the sequence length via leveraging the XGBoost and LightGBM models, the proposed approach can determine optimal starting point (OSP) of the time series and thus enhance the prediction performances. The performances of the OSP-TSP approach are then evaluated across various frequencies on the M4 dataset and other real-world datasets. Empirical results indicate that predictions based on the OSP-TSP approach consistently outperform those using the complete dataset. Moreover, recognizing the necessity of sufficient data to effectively train models for OSP identification, we further propose targeted solutions to address the issue of data insufficiency.
UniTTA: Unified Benchmark and Versatile Framework Towards Realistic Test-Time Adaptation
Du, Chaoqun, Wang, Yulin, Guo, Jiayi, Han, Yizeng, Zhou, Jie, Huang, Gao
Test-Time Adaptation (TTA) aims to adapt pre-trained models to the target domain during testing. In reality, this adaptability can be influenced by multiple factors. Researchers have identified various challenging scenarios and developed diverse methods to address these challenges, such as dealing with continual domain shifts, mixed domains, and temporally correlated or imbalanced class distributions. Despite these efforts, a unified and comprehensive benchmark has yet to be established. To this end, we propose a Unified Test-Time Adaptation (UniTTA) benchmark, which is comprehensive and widely applicable. Each scenario within the benchmark is fully described by a Markov state transition matrix for sampling from the original dataset. The UniTTA benchmark considers both domain and class as two independent dimensions of data and addresses various combinations of imbalance/balance and i.i.d./non-i.i.d./continual conditions, covering a total of \( (2 \times 3)^2 = 36 \) scenarios. It establishes a comprehensive evaluation benchmark for realistic TTA and provides a guideline for practitioners to select the most suitable TTA method. Alongside this benchmark, we propose a versatile UniTTA framework, which includes a Balanced Domain Normalization (BDN) layer and a COrrelated Feature Adaptation (COFA) method--designed to mitigate distribution gaps in domain and class, respectively. Extensive experiments demonstrate that our UniTTA framework excels within the UniTTA benchmark and achieves state-of-the-art performance on average. Our code is available at \url{https://github.com/LeapLabTHU/UniTTA}.
Using Low-Discrepancy Points for Data Compression in Machine Learning: An Experimental Comparison
Gรถttlich, Simone, Heieck, Jacob, Neuenkirch, Andreas
Low-discrepancy points (also called Quasi-Monte Carlo points) are deterministically and cleverly chosen point sets in the unit cube, which provide an approximation of the uniform distribution. We explore two methods based on such low-discrepancy points to reduce large data sets in order to train neural networks. The first one is the method of Dick and Feischl [4], which relies on digital nets and an averaging procedure. Motivated by our experimental findings, we construct a second method, which again uses digital nets, but Voronoi clustering instead of averaging. Both methods are compared to the supercompress approach of [14], which is a variant of the K-means clustering algorithm. The comparison is done in terms of the compression error for different objective functions and the accuracy of the training of a neural network.
Continual Learning for Robust Gate Detection under Dynamic Lighting in Autonomous Drone Racing
Qiao, Zhongzheng, Pham, Xuan Huy, Ramasamy, Savitha, Jiang, Xudong, Kayacan, Erdal, Sarabakha, Andriy
In autonomous and mobile robotics, a principal challenge is resilient real-time environmental perception, particularly in situations characterized by unknown and dynamic elements, as exemplified in the context of autonomous drone racing. This study introduces a perception technique for detecting drone racing gates under illumination variations, which is common during high-speed drone flights. The proposed technique relies upon a lightweight neural network backbone augmented with capabilities for continual learning. The envisaged approach amalgamates predictions of the gates' positional coordinates, distance, and orientation, encapsulating them into a cohesive pose tuple. A comprehensive number of tests serve to underscore the efficacy of this approach in confronting diverse and challenging scenarios, specifically those involving variable lighting conditions. The proposed methodology exhibits notable robustness in the face of illumination variations, thereby substantiating its effectiveness.
Implementation and Evaluation of a Gradient Descent-Trained Defensible Blackboard Architecture System
Milbrath, Jordan, Rivard, Jonathan, Straub, Jeremy
A variety of forms of artificial intelligence systems have been developed. Two well-known techniques are neural networks and rule-fact expert systems. The former can be trained from presented data while the latter is typically developed by human domain experts. A combined implementation that uses gradient descent to train a rule-fact expert system has been previously proposed. A related system type, the Blackboard Architecture, adds an actualization capability to expert systems. This paper proposes and evaluates the incorporation of a defensible-style gradient descent training capability into the Blackboard Architecture. It also introduces the use of activation functions for defensible artificial intelligence systems and implements and evaluates a new best path-based training algorithm.